<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  

  
  <title>Hexo</title>
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
  <meta property="og:type" content="website">
<meta property="og:title" content="Hexo">
<meta property="og:url" content="http://shawn_fighter.gitee.io/shawn/index.html">
<meta property="og:site_name" content="Hexo">
<meta property="og:locale" content="en_US">
<meta property="article:author" content="John Doe">
<meta name="twitter:card" content="summary">
  
    <link rel="alternate" href="/atom.xml" title="Hexo" type="application/atom+xml">
  
  
    <link rel="icon" href="/favicon.png">
  
  
    <link href="//fonts.googleapis.com/css?family=Source+Code+Pro" rel="stylesheet" type="text/css">
  
  
<link rel="stylesheet" href="/css/style.css">

<meta name="generator" content="Hexo 4.2.1"></head>

<body>
  <div id="container">
    <div id="wrap">
      <header id="header">
  <div id="banner"></div>
  <div id="header-outer" class="outer">
    <div id="header-title" class="inner">
      <h1 id="logo-wrap">
        <a href="/" id="logo">Hexo</a>
      </h1>
      
    </div>
    <div id="header-inner" class="inner">
      <nav id="main-nav">
        <a id="main-nav-toggle" class="nav-icon"></a>
        
          <a class="main-nav-link" href="/">Home</a>
        
          <a class="main-nav-link" href="/archives">Archives</a>
        
      </nav>
      <nav id="sub-nav">
        
          <a id="nav-rss-link" class="nav-icon" href="/atom.xml" title="RSS Feed"></a>
        
        <a id="nav-search-btn" class="nav-icon" title="Search"></a>
      </nav>
      <div id="search-form-wrap">
        <form action="//google.com/search" method="get" accept-charset="UTF-8" class="search-form"><input type="search" name="q" class="search-form-input" placeholder="Search"><button type="submit" class="search-form-submit">&#xF002;</button><input type="hidden" name="sitesearch" value="http://shawn_fighter.gitee.io/shawn"></form>
      </div>
    </div>
  </div>
</header>
      <div class="outer">
        <section id="main">
  
    <article id="post-kettle入门篇(一)" class="article article-type-post" itemscope itemprop="blogPost">
  <div class="article-meta">
    <a href="/2020/07/13/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/" class="article-date">
  <time datetime="2020-07-12T16:15:32.000Z" itemprop="datePublished">2020-07-13</time>
</a>
    
  </div>
  <div class="article-inner">
    
    
      <header class="article-header">
        
  
    <h1 itemprop="name">
      <a class="article-title" href="/2020/07/13/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/">Getting Started with Kettle (Part 1)</a>
    </h1>
  

      </header>
    
    <div class="article-entry" itemprop="articleBody">
      
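        <h1 id="简介"><a href="#简介" class="headerlink" title="Introduction"></a>Introduction</h1><p>Kettle is an open-source ETL tool written entirely in Java. It runs on Windows, Linux, and Unix, and extracts data efficiently and reliably.</p>
<p>The name refers to a kettle: Matt, the project's lead developer, wanted to pour all kinds of data into one pot and have it flow out in a specified format.</p>
<p>As an ETL toolset, Kettle lets you manage data from different databases through a graphical environment in which you describe what you want to do rather than how to do it.<br>Kettle has two kinds of script files: transformations and jobs. A transformation performs the basic data transformation, while a job controls the overall workflow.</p>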
<h1 id="安装"><a href="#安装" class="headerlink" title="Installation"></a>Installation</h1><ol>
<li><p>Download page for all versions: <a href="https://sourceforge.net/projects/pentaho/files/" target="_blank" rel="noopener">https://sourceforge.net/projects/pentaho/files/</a></p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/kettle%E5%90%84%E7%89%88%E6%9C%AC%E4%B8%8B%E8%BD%BD%E5%9C%B0%E5%9D%80%E7%95%8C%E9%9D%A2.png" alt="Download page listing all Kettle versions"></p>
</li>
<li><p>Click <code>Pentaho 9.0</code>, then select <code>client-tools</code>.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/Pentaho9.0.png" alt="Pentaho9.0"></p>
</li>
<li><p>Click <code>pdi-ce-9.0.0.0-423.zip</code>, then unzip the archive once the download finishes.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/pdi-ce-9.0.0.0-423.zip.png" alt="pdi-ce-9.0.0.0-423.zip"></p>
</li>
<li><p>Kettle runs on the JDK, so a JDK must be installed. The minimum version is JDK 1.8.</p>
</li>
<li><p>After unzipping, set the <code>KETTLE_HOME</code> environment variable.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/kettle_home.png" alt="kettle_home"></p>
</li>
</ol>
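<p>The two prerequisites above (JDK 1.8 or later, and a <code>KETTLE_HOME</code> variable) can be checked with a small script. The following is only a sketch: the function names are made up for this post, and the version parsing assumes the usual quoted version string that <code>java -version</code> prints.</p>

```python
import os
import re
import subprocess
from typing import Mapping, Optional, Tuple


def parse_java_version(version_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the text printed by `java -version`.

    Handles the legacy scheme ("1.8.0_251") and the newer one ("11.0.7").
    Returns None when no quoted version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def meets_minimum_jdk(version_output: str) -> bool:
    """True when the reported version is JDK 1.8 (Java 8) or later."""
    parsed = parse_java_version(version_output)
    # (1, 8) is Java 8 in the legacy scheme; Java 9+ reports (9, x), (11, x), ...
    return parsed is not None and parsed >= (1, 8)


def kettle_home_is_set(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when KETTLE_HOME is defined and non-empty in the environment."""
    env = os.environ if env is None else env
    return bool(env.get("KETTLE_HOME"))


def installed_java_version_output() -> str:
    """Run `java -version`; the JVM writes this report to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr
```

<p>Calling <code>meets_minimum_jdk(installed_java_version_output())</code> and <code>kettle_home_is_set()</code> before launching Spoon gives a quick sanity check. The first call raises <code>FileNotFoundError</code> if no <code>java</code> executable is on the path.</p>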
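<h1 id="目录结构"><a href="#目录结构" class="headerlink" title="Directory Structure"></a>Directory Structure</h1><p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/kettle%E7%9B%AE%E5%BD%95%E7%BB%93%E6%9E%84.png" alt="Kettle directory structure"></p>
<p>The main entries are:</p>
<ol>
<li>lib: the JAR files Kettle needs at run time. For example, connecting to a database repository requires the database driver JAR to be here.</li>
<li>libswt: the JAR files for the Kettle UI, split by platform: Linux, macOS (OS X), 32-bit Windows, and 64-bit Windows.</li>
<li>plugins: Kettle uses a plugin architecture, so you can develop your own Kettle plugins. Examples include the big data plugin and the JSON plugin.</li>
<li>pwd: needed when deploying a cluster.</li>
<li>samples: some example cases that ship with Kettle.</li>
<li>ui: controls how the Kettle UI components are displayed.</li>
<li>spoon: lets you design ETL transformations through a graphical interface.</li>
<li>pan: lets you run transformations designed in Spoon in batch mode (for example, from a scheduler). Pan runs in the background and has no graphical interface.</li>
<li>chef: lets you design jobs. A job chains together transformations, other jobs, scripts, and so on, which makes it easier to automate the complex work of updating a data warehouse. Each job is checked to see whether it ran correctly. (In recent releases, jobs are designed in Spoon itself.)</li>
<li>kitchen: lets you run jobs designed in Chef in batch mode (for example, from a scheduler). Kitchen also runs in the background.</li>
</ol>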
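<p>Because Pan and Kitchen are command-line programs, a scheduler usually calls them with a file path and a log level. The sketch below only assembles such a command line. It assumes the Linux launchers <code>pan.sh</code> and <code>kitchen.sh</code> with the dash-style <code>-file</code>, <code>-level</code>, and <code>-param</code> options (the Windows <code>.bat</code> launchers use a slash-style syntax instead). The function names, file paths, and parameter names are hypothetical.</p>

```python
from typing import Dict, List, Optional


def _build_command(launcher: str, path: str, level: str,
                   params: Optional[Dict[str, str]]) -> List[str]:
    """Assemble an argument list suitable for subprocess.run()."""
    command = [launcher, f"-file={path}", f"-level={level}"]
    # Named parameters are passed as -param:NAME=value, one per argument.
    for name, value in (params or {}).items():
        command.append(f"-param:{name}={value}")
    return command


def build_pan_command(ktr_file: str, level: str = "Basic",
                      params: Optional[Dict[str, str]] = None) -> List[str]:
    """Command line that runs one transformation (.ktr) with Pan."""
    return _build_command("pan.sh", ktr_file, level, params)


def build_kitchen_command(kjb_file: str, level: str = "Basic",
                          params: Optional[Dict[str, str]] = None) -> List[str]:
    """Command line that runs one job (.kjb) with Kitchen."""
    return _build_command("kitchen.sh", kjb_file, level, params)


if __name__ == "__main__":
    # Hypothetical transformation file and parameter, for illustration only.
    print(" ".join(build_pan_command("/data/etl/load_orders.ktr",
                                     params={"RUN_DATE": "2020-07-13"})))
```

<p>Returning a list rather than a single string means the result can be passed to <code>subprocess.run</code> without shell quoting problems.</p>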
<h1 id="连接资源库"><a href="#连接资源库" class="headerlink" title="Connecting to a Repository"></a>Connecting to a Repository</h1><ol>
<li><p>Check that the database driver JAR is present. If it is not, copy it into the <code>lib</code> directory first. For MySQL, for example, you need <code>mysql-connector-java-5.1.46.jar</code>.</p>
</li>
<li><p>Create a database named <code>kettle_repository</code> on your database server.</p>
</li>
<li><p>Run <code>spoon.bat</code>. Once Spoon opens, click Connect, then Repository Manager.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/spoon.png" alt="spoon"></p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/repository-managment.png" alt="repository-managment"></p>
</li>
<li><p>Click <code>Add</code>.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/add%20repository.png" alt="add repository"></p>
</li>
<li><p>Click <code>other repository</code>.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/other-repository.png" alt=""></p>
</li>
<li><p>Select <code>database repository</code>, then click <code>get started</code>.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/choose%20database%20repository.png" alt=""></p>
</li>
<li><p>Fill in <code>display name</code>, then select <code>database connection</code>.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/choose%20datasource%20connection.png" alt="datasource connection"></p>
</li>
<li><p>Click <code>new</code>.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/add%20datasource%20connection.png" alt="add datasource connection"></p>
</li>
<li><p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/add%20connection.png" alt="add connection"></p>
</li>
<li><p>Keep clicking <code>back</code> until you return to the screen from step 7, then click finish.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/finish%20add%20connection.png" alt="finish add connection"></p>
<p>When the following screen appears, Kettle is creating the tables it needs.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/almost%20done.png" alt="almost done"></p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/kettle%20tables.png" alt="kettle tables"> (the screenshot shows only some of the tables)</p>
</li>
<li><p>Click <code>connect now</code> and log in with <code>user name: admin, password: admin</code>.</p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/connect%20now.png" alt=""></p>
<p><img src="/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/kettle%20login.png" alt="kettle login"></p>
</li>
</ol>
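<p>The connection dialog asks for a host, port, and database name, which the MySQL driver ultimately receives as a JDBC URL. The sketch below shows roughly what that URL looks like for the <code>kettle_repository</code> database. The function name, the default host and port, and the example options are assumptions for illustration, and the exact URL Kettle generates may differ.</p>

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def mysql_jdbc_url(host: str = "localhost", port: int = 3306,
                   database: str = "kettle_repository",
                   options: Optional[Dict[str, str]] = None) -> str:
    """Build a MySQL JDBC URL of the form jdbc:mysql://host:port/database."""
    url = f"jdbc:mysql://{host}:{port}/{database}"
    if options:
        # Driver options are appended as an ordinary query string.
        url += "?" + urlencode(options)
    return url


if __name__ == "__main__":
    # Assumed local MySQL server on the default port.
    print(mysql_jdbc_url())
    print(mysql_jdbc_url(options={"useSSL": "false"}))
```

<p>Testing the same URL from any other JDBC client is a quick way to confirm that the driver JAR and the database are reachable before creating the repository.</p>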

      
    </div>
    <footer class="article-footer">
      
      
    </footer>
  </div>
  
</article>


  


</section>
        
          <aside id="sidebar">
  
    

  
    

  
    
  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Archives</h3>
    <div class="widget">
      <ul class="archive-list"><li class="archive-list-item"><a class="archive-list-link" href="/archives/2020/07/">July 2020</a></li></ul>
    </div>
  </div>


  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Recent Posts</h3>
    <div class="widget">
      <ul>
        
          <li>
            <a href="/2020/07/13/kettle%E5%85%A5%E9%97%A8%E7%AF%87(%E4%B8%80)/">Getting Started with Kettle (Part 1)</a>
          </li>
        
      </ul>
    </div>
  </div>

  
</aside>
        
      </div>
      <footer id="footer">
  
  <div class="outer">
    <div id="footer-info" class="inner">
      &copy; 2020 John Doe<br>
      Powered by <a href="http://hexo.io/" target="_blank">Hexo</a>
    </div>
  </div>
</footer>
    </div>
    <nav id="mobile-nav">
  
    <a href="/" class="mobile-nav-link">Home</a>
  
    <a href="/archives" class="mobile-nav-link">Archives</a>
  
</nav>
    

<script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>


  
<link rel="stylesheet" href="/fancybox/jquery.fancybox.css">

  
<script src="/fancybox/jquery.fancybox.pack.js"></script>




<script src="/js/script.js"></script>




  </div>
</body>
</html>